Modern information and communication technologies, especially the Internet, have diminished the role of 
spatial distances and territorial boundaries in the access to and transmissibility of information. This has 
enabled closer collaboration among scientists and greater internationalization. Nevertheless, geography remains an 
important factor affecting the dynamics of science. Here we present a systematic analysis of citation and 
collaboration networks between cities and countries, by assigning papers to the geographic locations of their 
authors' affiliations. The citation flows as well as the collaboration strengths between cities decrease with the 
distance between them and follow gravity laws. In addition, the total research impact of a country grows 
linearly with the amount of national funding for research & development. However, the average impact 
reveals a peculiar threshold effect: the scientific output of a country may reach an impact larger than the 
world average only if the country invests more than about 100,000 USD per researcher annually. 

The strength of most interactions in nature typically decreases with the distance between objects or 
constituents. The most famous example is Newton's gravitational force, which is known to decay with the square of 
the distance between the masses. This principle holds also outside the realm of physical processes. Recent 
studies on mobile phone communication networks 1,2 and blogs 3 have revealed that the probability for a social tie 
to occur between agents decays with a power of their distance. 

Likewise, scientific interactions are likely to take place between scholars localized in the same or nearby areas. 
Scientists tend to cluster in space, since the elaboration and progress of a project require frequent discussions 
between collaborators, which are hardly possible if they live far apart. Cultural, linguistic and 
institutional differences create additional obstacles to long-distance cooperation 4. Further, research funding is 
mostly allocated at the national level 5, thus favoring regional over international collaborations. 

Nowadays, the Internet and the greater affordability of international transportation have enormously reduced 
distances between people, overcoming both geographic and cultural barriers 6-8. This in turn has made scientific 
collaborations between distant scholars far easier than before 9-14. Nevertheless, the role of geography in the 
creation and recognition of scientific output is not yet fully known. For example, how do scientific interactions 
depend on distance? Is collaboration concentrated within the perimeter of a university, of a city or of a country, as 
it used to be in the past, or has it become truly international, possibly due to modern information and 
communication technologies? 

Multi-authored collaborations offer a great opportunity for science 15, as one can integrate a wide range of 
competences and skills to attack difficult problems with an enhanced chance of success. Indeed, the last decades 
have witnessed the formation of larger and larger research teams 16,17. In particular, multi-university 
collaborations have been growing at a fast pace and are more likely to lead to high impact publications 18, especially if they 
involve different countries 19,20. On the other hand, there is also evidence of decreasing returns from large team 
size, likely due to management inefficiencies, which limit the productivity arising from collaboration 21. 

Geographic proximity is also likely to affect the process of giving and receiving credit for someone's work, 
expressed by paper citations. For most papers one expects to find a decaying probability of citation with distance, 
as new findings are typically more visible in the area where the authors operate. This is confirmed by a recent 
study 22. In addition, collaboration patterns are likely to influence and be influenced by citations. While 
collaborating, scholars become more familiar with the scientific output of their co-authors, which then has a higher 
chance to be cited in the future. In turn, scholars who frequently cite each other's work have strongly 
overlapping research interests, and are more likely to become co-authors sooner or later. Therefore citations and 
collaborations between distinct locations are likely to be correlated. 
However, it is crucial to assess how collaborative patterns affect 
citation flows, to be able to disentangle the actual impact of a pub- 
lication (and, therefore, its merit) from credits coming through social 
networking. A geographic analysis of citation flows between cities is 
also useful to understand how quickly a new result gets recognized by 
the scientific community in different geographical areas, which may 
help to uncover how new scientific paradigms spread and get estab- 
lished 23 . 

Knowing how scientific interactions vary with distance is also 
valuable for practical reasons. To scholars, it might suggest how to 
choose collaborators in order to optimize the impact and visibility of 
their research. To institutions and governments, it might advise 
suitable allocations of funds for regional and international projects, 
in order to improve the scientific outcome for a given amount of 
resources. It is then not surprising that spatial scientometrics has 
acquired a prominent role during the last few years. A number of 
studies have been carried out exploiting the enhanced availability 
of citation data 24. Yet there are other factors, notably funding, that 
also play a crucial role in the development of a research project, as funding 
not only contributes towards the direct and overhead costs of the 
research but also facilitates the cooperation and collaboration among 
researchers working in different locations and different fields 25. Since 
both public and industrial resources are used to fund academic 
research, it is also natural to question the results and impact obtained 
with these resources 26,27. 

We have performed the first comprehensive study of citation and 
collaborative interactions between different geographic locations. 
We used one of the world's largest citation databases to derive the 
citation and the collaboration networks, i.e. weighted networks where 
nodes are cities and links are citations and collaborations between 
the corresponding cities (see Methods). The analysis of these 
networks 28-31 discloses the existence of gravity laws as well as a non-trivial 
correlation between collaborations and citations. Finally, we explore 
the importance of funding for research and development 
in promoting high quality science, by studying the relationship 
between national expenditure, the number of publications and their 
impact in terms of number of citations for different countries. 

Results 

The research contribution of each country in terms of the 
(normalized) number of citations received, $N^{\mathrm{cite}}$, is illustrated in the world 
map of Fig. 1A. Colored maps can be misleading, as the value assigned 
to a large area gives an impression of a much greater impact of that 
color in the visualization. We thus created a cartogram, in which the 
geographic regions are deformed and rescaled in proportion to their 
relative research contribution 32. The citation strengths of countries 
span over seven orders of magnitude. North America and Europe 
receive 42.3% and 35.3% of the world's citations, respectively. In 
contrast, the contribution by Asia amounts to only 17.7% of the world's 
citations, while the total contribution of Africa, South America and 
Oceania is lower than 5%. In this ranking the United States is the 
leading country, followed by the United Kingdom, Germany, Japan, 
and China. The corresponding world map in terms of countries' 
number of (normalized) publications is shown in the 
Supplementary Fig. S1 online. This heterogeneity suggests that a small number 
of countries contribute substantially to research while the rest 
contribute negligibly. In Fig. S2 online we report the results 
for the average number of citations of each country. 

In order to assess the quality of papers published by different 
countries we consider the number of citations of each of the papers 
written in that country. In Fig. 1B we plot the probability distribution 
of the number of citations of papers in the largest 20 countries. A 
paper is associated with a country if at least one of its affiliations is from 
that country. All these distributions are broad and vary over four 
orders of magnitude. When each distribution is rescaled by the 
average number of citations of papers of the respective country, all 
curves nicely collapse (Fig. 1C). This result suggests that the 
functional form of the citation distribution is the same in each country 
and that the difference between countries can be effectively 
summarized by the average number of citations. This type of universality 
holds at the level of scientific disciplines as well 33. 
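
The rescaling behind the collapse in Fig. 1C can be illustrated with a minimal sketch. The function name and the toy citation counts below are hypothetical and are not taken from the dataset; the sketch only shows that dividing each citation count by the country average maps distributions of the same shape onto each other.

```python
def rescale_by_country_mean(citations_by_country):
    """Divide every paper's citation count by the mean count of its country."""
    rescaled = {}
    for country, counts in citations_by_country.items():
        mean = sum(counts) / len(counts)
        if mean == 0:
            continue  # rescaling is undefined if no paper was ever cited
        rescaled[country] = [c / mean for c in counts]
    return rescaled


# Hypothetical toy data: country "B" has the same distribution shape as
# country "A", but five times more citations per paper.
toy = {"A": [0, 2, 4, 10], "B": [0, 10, 20, 50]}
rescaled = rescale_by_country_mean(toy)
print(rescaled["A"])
print(rescaled["B"])
```

If two countries differ only by their average number of citations, the rescaled values coincide, which is the sense in which the curves "collapse".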

Next we consider the contribution at the level of cities. In Fig. 1D 
we plot the probability distribution of the cities' citations. The 
distribution is broad, spanning over five orders of magnitude, and it 
follows a power law decay with exponent 1.46 ± 0.03. This suggests a 
relationship with the population of the city, as the city size 
distribution obeys the Zipf law 34,35, i.e. decays as a power law (with 
exponent 2). The observed power law scaling relation might suggest a 
self-organization phenomenon due to the agglomeration benefits in 
science. These advantages can be due to the ease of collaboration 
between groups working in similar fields, sharing of infrastructure 
and support, etc., which lead to efficient integration and transfer of 
information. 

We now consider the weighted citation network between cities, 
where the nodes are the cities that are connected by weighted and 
directed links, indicating publications of one city citing publications 
of the others. The network has 18,199 nodes and 9,494,021 links 
including 14,447 self-links (i.e., citations within the same city). In 
Fig. 1D we plot the cumulative distribution of the weights of self-links 
and links between different nodes. Both these distributions are broad; 
however, the weights of self-links are more heterogeneous, revealing 
a bias towards self-citations. Next we calculate the number of 
incoming links, i.e., the in-degree $k_i^{\mathrm{in}}$ of each node $i$ and its in-strength, 
$s_i^{\mathrm{in}} = \sum_j w_{ji}^{\mathrm{cite}}$, which equals the number $N_i^{\mathrm{cite}}$ of (normalized) 
citations received. By plotting the in-degree against the in-strength, we 
find that there is a power law scaling behavior with 
$\langle s^{\mathrm{in}} \rangle (k^{\mathrm{in}}) \propto (k^{\mathrm{in}})^{\alpha}$ 
(Fig. 1E). However, there are two distinct scaling regimes: for nodes 
with small $k_i^{\mathrm{in}}$ (< 200) the exponent is $\alpha$ = 0.91 ± 0.03 (regression 
coefficient ± standard error of the estimate R = 0.95 ± 0.01), while 
for large $k_i^{\mathrm{in}}$ (> 200) the exponent is $\alpha$ = 2.20 ± 0.08 (R = 2.01 ± 
0.01). The super-linear behavior suggests that stronger links are more 
frequently connected to high in-degree nodes. The out-strength of 
the nodes follows a similar relationship with the out-degree of the 
nodes (see Supplementary Fig. S1 online). Finally, we plot the weights 
of the links $w_{ij}^{\mathrm{cite}}$ against the product of the node strengths $s_i^{\mathrm{out}} s_j^{\mathrm{in}}$. The 
product $s_i^{\mathrm{out}} s_j^{\mathrm{in}}$ gives the weight of a link that is expected to occur by 
chance between $i$ and $j$ if all the papers were citing each other at 
random. Even in this case there are two distinct scaling regions, 
$w_{ij}^{\mathrm{cite}} \propto (s_i^{\mathrm{out}} s_j^{\mathrm{in}})^{\alpha}$, where $\alpha$ = 0.13 ± 0.01 (R = 0.19 ± 0.0003) if 
the product is less than $2 \times 10^{7}$, while for larger values of the product 
$\alpha$ = 0.99 ± 0.01 (R = 1.07 ± 0.001). This suggests that the observed 
citation is as expected between high strength nodes, while it is much 
lower in the case of cities with low strength. 
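
The node quantities used above can be sketched in a few lines. This is only an illustration under stated assumptions: the link dictionary is hypothetical toy data, self-links are counted in the in-degree, and the random expectation is normalized by the total link weight, a normalization that the text does not specify (it uses the bare product of the strengths).

```python
from collections import defaultdict


def node_statistics(links):
    """links maps (citing_city, cited_city) to the link weight."""
    in_degree = defaultdict(int)
    in_strength = defaultdict(float)
    out_strength = defaultdict(float)
    for (source, target), weight in links.items():
        out_strength[source] += weight
        in_strength[target] += weight
        in_degree[target] += 1  # assumption: self-links count as incoming links
    return in_degree, in_strength, out_strength


def random_expectation(source, target, in_strength, out_strength):
    """Weight expected if citations were exchanged at random.

    Assumption: the product of strengths is divided by the total weight,
    so that the expected weights sum to the observed total.
    """
    total = sum(out_strength.values())
    return out_strength[source] * in_strength[target] / total


# Hypothetical toy network with three cities.
links = {("a", "b"): 3.0, ("c", "b"): 1.0, ("b", "a"): 2.0, ("a", "a"): 4.0}
k_in, s_in, s_out = node_statistics(links)
expected_ab = random_expectation("a", "b", s_in, s_out)
print(k_in["b"], s_in["b"], expected_ab)
```

Comparing the observed weight of a link with this expectation is what the plot of link weight against the product of strengths does for all city pairs at once.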

Let us now consider the collaboration network at the city level, 
where the nodes are cities and weighted undirected links indicate the 
presence and frequency of collaborations between scholars of differ- 
ent cities. There are 18,199 nodes in the network and 1,256,718 
undirected links including 14,954 self-links. The weight of the self- 
links indicates the amount of internal collaboration. The degree of a 
node i indicates the number of other cities with which i collaborates 
and its strength is indicative of, but not coincident with, the number 
of papers written by scholars of institutions in that city. 

In Fig. 2A we plot the cumulative probability distribution of link 
weights. As for citations, the weights of self-links are more broadly 
distributed than the weights of the links between different cities, 
showing that scholars of a city collaborate more frequently with each 
other than with colleagues from any other city. The distributions of 
collaboration and citation streams between cities differ from their 
analogues in mobile phone communications and world trade, which 
show log-normal distributions 2,36. Next, we consider the fraction of 

Figure 1 | Properties of the world citation network. (A) Citation map of the world where the area of each country is scaled and deformed according to the 
number of citations received, which is also represented by the color of each country. (B) Citation distribution of papers of top 20 countries. If a paper is 
written by authors from multiple countries, the paper contributes to each country. (C) When the distributions in (B) are normalized by the average 
number of citations of each country, they fall on top of each other. (D) Probability distribution function of the number of citations received by each city. 
(E) Cumulative distribution function of the link weights $w_{ij}$ (excluding self-links) and self-links $w_{ii}$ in the citation network of cities. (F) Node in-strength 
against its in-degree for the city citation network. (G) Link weight against the product of the strengths of the connected nodes in the city citation network. 
For each plot we show the corresponding best-fit lines and power law exponents. 

internal collaboration by calculating the ratio of the weight of the 
self-link to the strength of the node. By plotting $w_{ii}^{\mathrm{col}} / s_i$ against the 
strength of the node $s_i$, we see that the ratio increases with $s_i$, 
indicating that as the city size increases most of its collaborations 
take place within the city (Fig. 2B). However, for small cities most of 
their papers are written with external collaborators. The node degree 

Figure 2 | Properties of the world collaboration network. (A) Cumulative 
probability distribution of the link weights in the collaboration network of 
cities. Self-links are shown separately. (B) Fraction of internal 
collaboration, indicated by the ratio of the weight $w_{ii}^{\mathrm{col}}$ of the self-link and 
strength $s_i$ of a node, against $s_i$. (C) Strength of a node against its degree. 
The straight line indicates a power law behavior with exponent 1.66 ± 0.04. 
In these plots we use the same colorbar as in Fig. 1. 

scales with its strength as $\langle s \rangle (k) \propto k^{\alpha}$, where $\alpha$ = 1.66 ± 0.04 (R = 
1.65 ± 0.01) (Fig. 2C). This super-linear scaling suggests that higher 
degree nodes are more frequently connected by stronger links. 
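
A scaling exponent of this kind can be estimated, for instance, by an ordinary least-squares fit in log-log space. The sketch below assumes this simple procedure, which is not necessarily the fitting method used for the figures; the function name and the toy data are hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x**alpha on logarithmic axes.

    Returns (alpha, c). All xs and ys must be strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((x - mean_x) ** 2 for x in log_x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    alpha = s_xy / s_xx
    c = math.exp(mean_y - alpha * mean_x)
    return alpha, c


# Hypothetical noiseless data following s = 2 * k**1.5.
degrees = [1, 2, 4, 8, 16]
strengths = [2 * k ** 1.5 for k in degrees]
alpha, c = fit_power_law(degrees, strengths)
print(alpha, c)
```

An exponent larger than 1, as found here for the collaboration network, means that the strength grows faster than the degree.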

Let us explore the relationship between the citation and the 
collaboration networks at both the country and the city level. At the 
country level the collaboration network comprises 226 nodes and 
10,308 undirected links, including 219 self-links. In the citation 
network there are also 226 nodes but 28,869 directed links, including 215 
self-links. In Fig. 3, we plot the weight of links of the collaboration 
network, $w_{ij}^{\mathrm{col}}$, against the weight of the same links in the citation 
network, $w_{ij}^{\mathrm{cite}} + w_{ji}^{\mathrm{cite}}$. We find the scaling 
$w_{ij}^{\mathrm{col}} \propto (w_{ij}^{\mathrm{cite}} + w_{ji}^{\mathrm{cite}})^{\alpha}$, where 
$\alpha$ = 1.04 ± 0.01 (R = 1.08 ± 0.008) for countries (Fig. 3A), and $\alpha$ = 
0.82 ± 0.02 (R = 1.05 ± 0.002) for cities (Fig. 3B), i.e. the increase in 
collaboration is approximately linearly related to the amount of citations exchanged 
between the two countries/cities. 

We now consider the dependence of the number of citations of a 
paper on the number of coauthors of that paper and on the number 
of affiliations of its coauthors. It has been previously shown that 
papers published by teams often get more citations than single author 
papers 17,18. Our results also show that the average number of cites of a 
publication increases with the number of co-authors of that 
publication (Fig. 3C). Furthermore, the average number of citations of a 
publication increases with the number of affiliated countries and 
cities of its authors (Fig. 3D and E). In order to separate the effect 
of the number of coauthors from that of the different types of collaboration 
(internal, domestic and international) we grouped the papers based 
on their affiliations and number of coauthors. In Table 1, we consider 
papers with a given number of authors and categorize them 
according to whether all the affiliations listed in the paper are from a single 
city, from multiple cities in a single country or from different 
countries. For an equal number of authors, publications having multiple 
international affiliations get a statistically significant increment 
($p < 10^{-4}$) in the number of citations with respect to publications with 
only domestic affiliations. Thus, crossing territorial boundaries also 
pays off in terms of scientific impact. In contrast, multiple domestic 
affiliations do not positively affect the number of citations when the 
number of authors in a publication is less than 6. 
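
The grouping used for Table 1 and a bootstrap estimate of the standard error of the mean can be sketched as follows. The representation of affiliations as (city, country) pairs, the function names, the number of resamples and the example values are all assumptions made for illustration.

```python
import random
import statistics


def collaboration_type(affiliations):
    """Classify a paper from its list of (city, country) affiliations."""
    countries = {country for _, country in affiliations}
    cities = set(affiliations)
    if len(countries) > 1:
        return "multiple countries"
    if len(cities) > 1:
        return "multiple cities"
    return "single city"


def bootstrap_sem(values, n_resamples=1000, seed=0):
    """Standard error of the mean from resampling with repetition."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)


# Hypothetical examples.
print(collaboration_type([("Espoo", "Finland"), ("Helsinki", "Finland")]))
print(collaboration_type([("Espoo", "Finland"), ("Turin", "Italy")]))
print(bootstrap_sem([1.0, 2.0, 3.0, 10.0]))
```

Papers would first be grouped by number of authors and collaboration type; the mean number of citations and its bootstrap error are then computed within each group.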

Next we consider the effect of geographical proximity on the cita- 
tion and collaboration networks by determining the geographic loca- 
tion (latitude and longitude) of each place in the dataset 37 (see 
Methods). We found that the probability that there is a link between 
two cities in the collaboration network decreases as a power law as 
the distance between the two cities increases (Fig. 4A). The power law 
exponent is 0.57 ± 0.01. Our results are different from those obtained 
in Ref. 38, where it was found that the distribution of distances 
between co-authors decreases exponentially. Such difference might 

Figure 3 | Correlation between the world citation and collaboration 
networks. Weight of the links in the citation network against the 
corresponding links in the collaboration network at the (A) country level 
and (B) city level network. Power law scaling is shown by solid lines with 
exponents 1.04 ± 0.01 and 0.82 ± 0.02, respectively. Density plot of the 
number of citations of a publication against the number of (C) co-authors, 
(D) countries and (E) cities in the affiliations. The circles indicate the average 
trend. 

be due to the limited dataset used in Ref. 38, which included only 
papers published before 1990, and possibly also due to the recent 
advances in communication and transportation technologies. 
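
Given the latitude and longitude of two cities, their distance can be computed, for example, with the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the text does not state which formula or Earth model was used, so this is one common choice rather than the authors' procedure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two points on the equator, 90 degrees of longitude apart:
# a quarter of the circumference.
quarter = great_circle_km(0.0, 0.0, 0.0, 90.0)
print(quarter)
```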

Many spatially embedded networks have been observed to follow 
gravity laws 37, where the flow between two locations follows 

$$T_{ij} \propto \frac{P_i P_j}{d_{ij}^{\alpha}} \qquad (1)$$

Here, $T_{ij}$ is the flow between nodes $i$ and $j$, $P_i$ and $P_j$ are the 
populations of nodes $i$ and $j$, respectively, and $d_{ij}$ is the geodesic distance 
between $i$ and $j$; the value of the exponent $\alpha$ depends on the 
system. For the collaboration network Eq. 1 becomes 

$$w_{ij}^{\mathrm{col}} \propto \frac{s_i s_j}{d_{ij}^{\alpha}} \qquad (2)$$

In Fig. 4B, we plot the ratio $w_{ij}^{\mathrm{col}} / (s_i s_j)$ against the distance 
between all node pairs. We found that as the distance increases, 
$w_{ij}^{\mathrm{col}} / (s_i s_j)$ decreases as a power law with the exponent $\alpha$ = 
1.16 ± 0.03 (R = -0.97 ± 0.002), except at very short distances. 
As we have seen before, collaboration and citation between two 
places are correlated. Hence, we also look at the geographical 
proximity in the citation network. We found that the probability that 
there is a link between two cities in the citation network also 
decreases with distance as a power law (Fig. 4C). In this case the 
power law exponent is much lower (0.30 ± 0.01). The gravity law 
for the citation network reads 

$$w_{ij}^{\mathrm{cite}} \propto \frac{s_i^{\mathrm{out}} s_j^{\mathrm{in}}}{d_{ij}^{\alpha}} \qquad (3)$$

In Fig. 4D we plot $w_{ij}^{\mathrm{cite}} / (s_i^{\mathrm{out}} s_j^{\mathrm{in}})$ against the distance between all the 
node pairs in the citation network. As for the collaboration network 

Table 1 | Dependence of citations on collaboration. We categorize each paper by the number of authors and their affiliations. For each of 
these groups we indicate the fraction of papers that are in the group and the mean number of citations. The error represents the standard 
error of the mean, calculated using bootstrap sampling with repetition 

N_Authors   Papers (in %)   Single City      Multiple City    Multiple Countries 
1           13.03           4.25 ± 0.02      4.95 ± 0.12      5.24 ± 0.11 
2           19.01           6.80 ± 0.02      6.11 ± 0.04      7.00 ± 0.05 
3           18.34           6.92 ± 0.02      6.38 ± 0.03      7.30 ± 0.04 
4           14.95           7.19 ± 0.02      7.02 ± 0.03      8.03 ± 0.04 
5           11.10           7.62 ± 0.03      7.66 ± 0.03      8.79 ± 0.04 
6           8.01            8.13 ± 0.04      8.52 ± 0.05      9.77 ± 0.05 
7           5.20            8.85 ± 0.05      9.56 ± 0.07      10.90 ± 0.07 
8           3.45            9.50 ± 0.07      10.67 ± 0.09     12.10 ± 0.10 
9           2.22            10.23 ± 0.10     11.52 ± 0.12     13.17 ± 0.12 
10          1.53            10.57 ± 0.12     12.45 ± 0.14     14.70 ± 0.15 
>10         3.17            13.82 ± 0.17     16.64 ± 0.16     21.37 ± 0.17 

we found that $w_{ij}^{\mathrm{cite}} / (s_i^{\mathrm{out}} s_j^{\mathrm{in}})$ decreases with distance as a power 
law with the exponent $\alpha$ = 0.77 ± 0.02 (R = -0.35 ± 0.001). The 
above analysis shows the existence of an important spatial 
component in both the citation and the collaboration network. It shows that 
both our collaborators and our citations typically come from our 
spatial neighborhood. Further, long distance collaborations as well 
as citations decrease as a power law of distance. The difference of the 
scaling exponents of the two networks suggests that two distant 
places are more likely to cite each other than collaborate. 
Additional results are shown in the Supplementary Fig. S3 online. 
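
The gravity-law analysis can be sketched as follows: the ratio of link weight to the product of the endpoint strengths is averaged in logarithmic distance bins, and the decay exponent is read off from a log-log fit. The binning scheme, the least-squares fit, the restriction to existing links and the synthetic data are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def binned_ratio(links, strength, distance_km, bins_per_decade=5):
    """Average w_ij / (s_i * s_j) in logarithmic distance bins.

    Returns a list of (bin_center_km, mean_ratio), sorted by distance.
    Self-links and zero distances are skipped.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for (i, j), w in links.items():
        d = distance_km[(i, j)]
        if i == j or d <= 0:
            continue
        b = math.floor(bins_per_decade * math.log10(d))
        sums[b] += w / (strength[i] * strength[j])
        counts[b] += 1
    return [(10 ** ((b + 0.5) / bins_per_decade), sums[b] / counts[b])
            for b in sorted(sums)]


def decay_exponent(points):
    """Exponent alpha of ratio ~ d**(-alpha), from a log-log least-squares fit."""
    log_d = [math.log(d) for d, r in points if r > 0]
    log_r = [math.log(r) for d, r in points if r > 0]
    n = len(log_d)
    mean_d, mean_r = sum(log_d) / n, sum(log_r) / n
    slope = (sum((x - mean_d) * (y - mean_r) for x, y in zip(log_d, log_r))
             / sum((x - mean_d) ** 2 for x in log_d))
    return -slope


# Synthetic network obeying w_ij = s_i * s_j / d_ij**1.2 exactly.
strength = {node: 10.0 for node in range(11)}
links, distance_km = {}, {}
for b in range(5, 15):
    pair = (0, b - 4)
    distance_km[pair] = 10 ** ((b + 0.5) / 5)
    links[pair] = strength[0] * strength[b - 4] / distance_km[pair] ** 1.2
alpha = decay_exponent(binned_ratio(links, strength, distance_km))
print(alpha)
```

On real data the first bins, corresponding to very short distances, may need to be excluded from the fit, in line with the deviation noted for the collaboration network.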

The research performance of each country is generally estimated 
on the basis of the number of publications and citations. Although 
these are straightforward measurements of research output, they 
depend on a wide spectrum of resources 39. For instance, the number 

Figure 4 | Effect of geographical proximity in the world collaboration and 
citation networks. The probability of existence of a link as a function of the 
distance between two cities in the (A) collaboration network and (B) 
citation network. Distribution of the ratio of the link weight and product of 
the strengths of its endpoints in (C) collaboration network, $w_{ij}^{\mathrm{col}} / (s_i s_j)$, and 
(D) citation network, $w_{ij}^{\mathrm{cite}} / (s_i^{\mathrm{out}} s_j^{\mathrm{in}})$, against the distance between 
cities. For each distance the average ratio is also shown. The solid line 
indicates a power law behavior with exponent $\alpha$ = 1.16 ± 0.03 and 0.77 ± 
0.02, respectively. 



of researchers and facilities (instruments, laboratories, libraries and 
other resources) available are typically different in different 
countries. A key determinant is the funding available for research & 
development (R&D). To quantify the expenses in R&D of a country 
we consider the fraction of gross domestic product (GDP) that is 
spent on R&D. To account for economic inequalities between 
countries we consider the R&D spending in terms of the purchasing power 
parity (PPP). In Fig. 5A, we plot the number of citations $N^{\mathrm{cite}}$ against 
the R&D expenditure and find that it scales linearly with funding. 
Such correlation is not surprising, but the scaling exponent is 
non-trivial. It suggests that it is not possible to perform or contribute 
substantially unless there is a corresponding amount of funding 
available for research. Moreover, the research contribution in terms 
of citations also scales linearly with the number of researchers in that 
country (Fig. 5B). This result is consistent with the fact that the R&D 
expenditure is correlated with the number of researchers. The 
number of publications of a country also shows similar scaling against 
R&D expenditure and number of researchers (Supplementary Fig. S4 
online). 

Finally, as a measure of the impact of a country's scientific output we 
consider the average number of citations to the publications of that 

Figure 5 | Relation between research outcome and funding. 
Number of citations of a country against (A) the expenditure in 
research and development (in millions of dollars per year, in purchasing 
power parity) and (B) the number of researchers in that country. The solid 
line indicates power law scaling with exponent 0.99 ± 0.03 and 0.98 ± 0.04, 
respectively. (C) Average number of citations per paper of a country 
against the average spending per researcher. The horizontal line indicates 
the average number of citations over all papers of all countries, the vertical 
line indicates the threshold of about 100,000 $ per researcher per year. 

country. In Fig. 5C we plot this number against the average spending 
per researcher per year (R&D expenditure divided by the number of 
researchers). The latter is not the average salary of researchers in that 
country, as it includes other expenditures such as infrastructure, 
bureaucracy, instruments, etc. This plot is much more scattered than 
the previous plots and does not show any definite correlation pattern. 
In order to identify groups of countries that behave similarly or show 
similar characteristics we use the k-means clustering technique 40. By 
using this clustering method with k = 2, we found that the countries 
can be classified into two groups, one with average spending less than 
about 100,000 $ per researcher per year and the other with average 
spending more than about 100,000 $ (Fig. S5 online). Other 
clustering methods also give qualitatively similar results. This 
separation into two groups, distinguished by the average spending per 
researcher per year (vertical line in the plot), also reveals another 
striking feature. If the average spending is less than about 100,000 
$ per researcher per year, we see an increase 
in the average number of citations with the spending. However, if the 
average spending exceeds this limit, the average number of citations 
becomes scattered and independent of funding. The figure shows that very rich countries like 
Kuwait and Luxembourg have high funding per researcher, yet their 
average number of citations per paper is low. Countries like India and 
Brazil have high funding per researcher as well, but a low average 
number of cites; this might mean they are investing more in infrastructure. 
Switzerland, Costa Rica, Panama, Germany, Austria, the Netherlands and 
the United States have high spending per researcher and their average 
number of citations is also high. If we display the number of cites 
per paper averaged over all countries (horizontal line), we see that 
there are no countries in the top left quadrant, i.e. it is not possible to 
do better than the world's average unless there is sufficient spending. 
Additional measures of a country's research performance and 
corresponding rankings are reported in the Supplementary Table S1 online. 
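
A two-group split of this kind can be illustrated with a minimal one-dimensional k-means (Lloyd's algorithm) with k = 2. The sketch assumes that only the spending per researcher is clustered, which the text does not specify; the initialization at the extreme values and the spending figures (in thousands of dollars per researcher per year) are hypothetical.

```python
def two_means_1d(values, max_iter=100):
    """Lloyd's algorithm with k = 2 on scalar values.

    Returns (centers, groups), where groups[0] gathers the values closest
    to the lower center and groups[1] those closest to the upper center.
    """
    centers = [min(values), max(values)]  # assumption: start from the extremes
    groups = [list(values), []]
    for _ in range(max_iter):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break  # assignments are stable
        centers = updated
    return centers, groups


# Hypothetical spending per researcher, in thousands of dollars per year.
spending = [20, 35, 50, 60, 150, 180, 220]
centers, (low, high) = two_means_1d(spending)
boundary = sum(centers) / 2  # midpoint separating the two groups
print(low, high, boundary)
```

The midpoint between the two cluster centers plays the role of the vertical threshold line in the plot.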

Discussion 

Our thorough analysis of the world citation and collaboration net- 
works has revealed that the effects of geography on the dynamics of 
science are relevant, despite the recent advances in communication 
and transportation. The occurrence of gravity laws for both citation 
and collaboration implies a preference by scientists to interact with 
peers in their geographic areas. However, long-distance interactions 
are not rare, as the interaction strength and probability are charac- 
terized by power law decays. Our work follows similar findings in 
mobile phone communication 1,2 , social media 3 and international 
trade 41 , reinforcing the belief that gravity laws hold in several differ- 
ent contexts, and that scientific interactions are not exceptional from 
this point of view. Thus, the gravity law is a fundamental relationship 
holding also in human dynamics. 

Citation and collaboration streams between distinct locations are 
strongly correlated, with an approximately linear relation. An 
increase in the number of collaborations between two cities is then 
expected to be followed by a proportional increase in the flow of 
citations between the cities. This is explained by the fact that 
people/groups working in similar fields and subject areas are more likely 
to cite as well as collaborate with each other, and also suggests a 
natural bias towards self-citation, of which we have provided strong 
quantitative evidence. 

From the point of view of scientific impact, it pays off for a team to 
put together several institutions with a strong international 
participation. While part of this effect could be explained by the fact that 
having people from different locations facilitates the circulation of a 
work, which then becomes more visible and more likely to be cited, 
the trend indicates that it is more likely to produce high quality work 
through international collaborations. It would be valuable to be able 
to disentangle the impact due to social networking from that due to 
the quality of the paper. Our findings pave the way for the first 
quantitative assessment of this issue. As a consequence, we expect 
to observe an increasing tendency to form large teams with members 
of many different countries in the future. 

We also disclose a striking effect in the relationship between the 
national expenditure per researcher and the impact of the scientific 
output of a country. If the average spending per researcher per year is 
low, it is impossible for a country to do better than the world average, 
in terms of the average number of cites per paper. So there is a 
minimal funding quota that needs to be exceeded if a country wishes 
to have a scientific output of high average quality. Exceeding the 
threshold, however, does not guarantee success. This suggests that 
in science money acts as a kind of threshold motivator: if one does 
not pay people enough they will not be motivated and the outcomes 
of the research are poor; if people are paid sufficiently to take the 
issue of money off the table, internationally competitive findings are 
within reach. On the other hand, for conceptual and creative tasks, 
paying more than a certain threshold does not necessarily increase 
the output 42-44. Further, our analysis reveals that at the country level 
funding has a positive linear impact on the research output both in 
terms of number of publications and of citations. Thus, it is not 
possible for a country to increase its research output substantially 
without a sizeable increase in investments. 

In the future we plan to study the role of cities' population, in 
particular on the distributions of citation and collaboration strengths 
along with their flows. It is well known that most characteristics of 
cities are strongly correlated to the size of their populations 45 . 
Furthermore, an analysis of the evolution of the world citation and 
collaboration networks would show how the spatial dimension of 
science dynamics has been affected by the progress of technology, 
internationalization and extreme events (e.g. wars, economic crises). 
This way one could infer how the scientific landscape has been shap- 
ing up in the last decades and how it is possible to create more 
efficient partnerships, via dedicated funding programs at the national 
and/or international level, and consequently a more productive and 
successful scholarly world. 

Methods 

Data description. We have analyzed all publications (articles, reviews and editorial 
comments) written in English from 2003 till the end of 2010 included in the database 
of the Institute for Scientific Information (ISI) Web of Science. For each publication 
we extracted the affiliations of the authors and the citations to that 
publication. We parsed the affiliations of all publications and determined the 
geographic location at the city and country level. If multiple affiliations are listed 
in a publication, the publication is associated with all represented cities and countries. After 
obtaining the locations we used publicly available resources (www.wikipedia.org 
and maps.google.com) to determine their coordinates (latitude and longitude). Our 
dataset consists of 8,094,948 publications which have received 62,105,592 citations 
during the period 2003-2010. We were able to extract the geographical information 
from 8,092,314 publications. Affiliations refer to 226 countries and 37,750 cities. In 
order to get rid of anomalies due to any misclassification, we have only considered 
those places that have appeared in at least 5 publications during the period 2003- 
2010. This cutoff led us to 18,199 cities, producing 99.8% of the total publications and 
receiving 99.9% of total citations. 
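
The cutoff described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used: the function name `filter_places` and the input format (one list of place names per publication) are hypothetical, and each place is assumed to be counted once per publication even if it appears in several affiliations.

```python
from collections import Counter


def filter_places(papers, min_pubs=5):
    """Return the set of places that appear in at least `min_pubs` publications.

    `papers` is an iterable of lists of place names (hypothetical format),
    one list per publication.
    """
    counts = Counter()
    for places in papers:
        # Assumption: a place is counted once per publication,
        # even when several affiliations share it.
        counts.update(set(places))
    return {place for place, c in counts.items() if c >= min_pubs}


if __name__ == "__main__":
    toy = [["A", "B"]] * 5 + [["A", "C", "A"]]
    print(sorted(filter_places(toy)))
```
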

Country-level information regarding expenditures for research and development 
(R&D) in terms of purchasing power parity (PPP) and the number of researchers in R&D 
was obtained from the World Bank Data (databank.worldbank.org) for each year 
from 2003 to 2010. By aggregating these yearly datasets we determined the average 
of each of the above quantities for the period 2003-2010. Expenditure data for 
R&D are available for 102 countries and the number of researchers for 89 countries; 
both datasets are available for 77 countries. Further details can be found in the 
Supplementary Methods online. 

Network construction. We have analyzed the data at the country and the city level. 
As the publications and their affiliations form a bipartite graph, we construct the 
collaboration network between countries (cities) by projecting it onto the space of 
affiliations. In this collaboration network individual countries (cities) act as nodes, 
and links between them indicate that they have appeared in the same publication. If a 
paper is written by authors with n affiliations, we put n(n-1)/2 undirected links 
between each possible pair of collaborating countries (cities), with every link having 
weight 2/[n(n-1)]. The total weight between any pair of nodes is the sum of all the 
weights over all the publications in the dataset. If there is a single affiliation in a 
publication then we put a self-link with weight 1. 
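
A minimal sketch of this weighting scheme, assuming each publication is given as a list of affiliation locations (one entry per affiliation, repeats allowed) and that every publication contributes a total weight of 1, i.e. each of the n(n-1)/2 affiliation pairs receives weight 2/[n(n-1)]. The function name `collaboration_weights` and the input format are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_weights(papers):
    """Accumulate undirected link weights between locations.

    `papers` is an iterable of lists of locations, one location per
    affiliation (hypothetical format). Keys are sorted pairs, so
    (a, b) and (b, a) refer to the same undirected link.
    """
    weights = defaultdict(float)
    for locations in papers:
        n = len(locations)
        if n == 0:
            continue
        if n == 1:
            # Single affiliation: self-link with weight 1.
            weights[(locations[0], locations[0])] += 1.0
            continue
        w = 2.0 / (n * (n - 1))  # assumed per-link weight, sums to 1 per paper
        for a, b in combinations(locations, 2):
            key = (a, b) if a <= b else (b, a)
            # Two affiliations in the same location yield a self-loop.
            weights[key] += w
    return dict(weights)


if __name__ == "__main__":
    toy = [["Helsinki", "Espoo", "Helsinki"], ["Turin"]]
    for link, w in sorted(collaboration_weights(toy).items()):
        print(link, round(w, 3))
```

With this normalization the sum of all link weights equals the number of publications, so papers with many affiliations do not dominate the network.
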
In the citation network between countries (cities) nodes are papers which are 
linked if one paper cites the other. If a paper written by authors with n affiliations cites 
a paper written by authors with m affiliations we put n × m directed connections from 
each of the n citing countries (cities) to each of the m cited countries (cities), every link 
having weight 1/(nm). The total weight of a directed link between two countries 
(cities) is the sum of all the weights over all the citations in the dataset. Since there can 
be multiple affiliations from the same country (city) in a publication, there are self- 
loops both in the world citation and in the world collaboration networks. 
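
The citation weights can be accumulated in the same spirit. The sketch below assumes each citation is given as a pair (citing locations, cited locations), with one entry per affiliation. The function name `citation_weights` and this input format are hypothetical.

```python
from collections import defaultdict


def citation_weights(citations):
    """Accumulate directed link weights between locations.

    `citations` is an iterable of (citing_locations, cited_locations)
    pairs (hypothetical format). Each citation spreads a total weight
    of 1 over its n x m directed connections.
    """
    weights = defaultdict(float)
    for citing, cited in citations:
        if not citing or not cited:
            continue  # skip records without geographic information
        w = 1.0 / (len(citing) * len(cited))
        for src in citing:
            for dst in cited:
                # src == dst produces a self-loop.
                weights[(src, dst)] += w
    return dict(weights)


if __name__ == "__main__":
    toy = [(["FI", "IT"], ["US"]), (["FI"], ["FI", "US"])]
    for link, w in sorted(citation_weights(toy).items()):
        print(link, w)
```
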

Great-circle distance. The geodesic or the great-circle distance is the shortest 
distance between any two points on the earth measured along a path on the surface of 
the earth. Given the latitudes and longitudes of two points, we have used the 
Haversine formula to calculate the great-circle distance between them 46 . In these 
calculations, we considered the earth's radius to be 6372.8 km. 
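
For illustration, the Haversine formula with the radius quoted above can be written as follows. This is a sketch rather than the code used for the analysis; the function name `great_circle_km` is hypothetical, and coordinates are assumed to be in decimal degrees.

```python
import math

EARTH_RADIUS_KM = 6372.8  # radius stated in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


if __name__ == "__main__":
    # A quarter of the equator, for a quick sanity check.
    print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```
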